Low-Rank Approximation of Weighted Tree Automata

نویسندگان

  • Guillaume Rabusseau
  • Borja Balle
  • Shay B. Cohen
چکیده

We describe a technique to minimize weighted tree automata (WTA), a powerful formalism that subsumes probabilistic context-free grammars (PCFGs) and latent-variable PCFGs. Our method relies on a singular value decomposition of the underlying Hankel matrix defined by the WTA. Our main theoretical result is an efficient algorithm for computing the SVD of an infinite Hankel matrix implicitly represented as a WTA. We provide an analysis of the approximation error induced by the minimization, and we evaluate our method on real-world data originating in newswire treebank. We show that the model achieves lower perplexity than previous methods for PCFG minimization, and also is much more stable due to the absence of local optima.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Some improvements of the spectral learning approach for probabilistic grammatical inference

Spectral methods propose new and elegant solutions in probabilistic grammatical inference. We propose two ways to improve them. We show how a linear representation, or equivalently a weighted automata, output by the spectral learning algorithm can be taken as an initial point for the Baum Welch algorithm, in order to increase the likelihood of the observation data. Secondly, we show how the inf...

متن کامل

Minimizing Deterministic Weighted Tree Automata

The problem of efficiently minimizing deterministic weighted tree automata (wta) is investigated. Such automata have found promising applications as language models in Natural Language Processing. A polynomial-time algorithm is presented that given a deterministic wta over a commutative semifield, of which all operations including the computation of the inverses are polynomial, constructs an eq...

متن کامل

Approximation in Description Logics: How Weighted Tree Automata Can Help to Define the Required Concept Comparison Measures in FL_0

Recently introduced approaches for relaxed query answering, approximately defining concepts, and approximately solving unification problems in Description Logics have in common that they are based on the use of concept comparison measures together with a threshold construction. In this paper, we will briefly review these approaches, and then show how weighted automata working on infinite trees ...

متن کامل

Approximation in Description Logics: How Weighted Tree Automata Can Help

Recently introduced approaches for relaxed query answering, approximately defining concepts, and approximately solving unification problems in Description Logics have in common that they are based on the use of concept comparison measures together with a threshold construction. In this paper, we will briefly review these approaches, and then show how weighted automata working on infinite trees ...

متن کامل

Series, Weighted Automata, Probabilistic Automata and Probability Distributions for Unranked Trees

We study tree series and weighted tree automata over unranked trees. The message is that recognizable tree series for unranked trees can be defined and studied from recognizable tree series for binary representations of unranked trees. For this we prove results of [1] as follows. We extend hedge automata – a class of tree automata for unranked trees – to weighted hedge automata. We define weigh...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016